Fullerton
Simultaneous Heterogeneity and Reduced-rank Learning for Multivariate Response Regression
Wu, Jie, Zhang, Bo, Li, Daoji, Zheng, Zemin
Heterogeneous data are now ubiquitous in many applications in which correctly identifying the subgroups from a heterogeneous population is critical. Although there is an increasing body of literature on subgroup detection, existing methods mainly focus on the univariate response setting. In this paper, we propose a joint heterogeneity and reduced-rank learning framework to simultaneously identify the subgroup structure and estimate the covariate effects for heterogeneous multivariate response regression. In particular, our approach uses rank-constrained pairwise fusion penalization and conducts the subgroup analysis without requiring prior knowledge regarding the individual subgroup memberships. We implement the proposed approach by an alternating direction method of multipliers (ADMM) algorithm and show its convergence. We also establish the asymptotic properties for the resulting estimators under mild and interpretable conditions. A predictive information criterion is proposed to select the rank of the coefficient matrix with theoretical support. The effectiveness of the proposed approach is demonstrated through simulation studies and a real data application.
- Asia > China > Anhui Province > Hefei (0.04)
- North America > United States > California > Orange County > Fullerton (0.04)
- North America > United States > New York (0.04)
Enhancing Explainability in Solar Energetic Particle Event Prediction: A Global Feature Mapping Approach
Ji, Anli, Patil, Pranjal, Pandey, Chetraj, Georgoulis, Manolis K., Aydin, Berkay
In total, this dataset comprises 244 strong SEP events that clearly exceed the threshold of 10 pfu in the GOES P3 channel and 189 weak events observed in near-Earth space from 1986 to 2018. Additionally, the dataset includes time-series slices of GOES proton and X-ray fluxes for all the events, where each slice consists of a 12-hour observation window prior to the event onset time, and the peak flux period of events. A detailed description of dataset generation and available parameters can be found in [29]. B. Experimental Settings In supervised classification tasks, datasets with labeled samples are commonly divided into distinct subsets with knowledge of the included labels [8]. The extracted features are used to configure the parameters of the chosen algorithm in the training set, and the classifier's predictive performance on new data is determined using the testing set. Given our prediction task as a classification problem, we partition our dataset into two non-overlapping subsets: a training set (i.e., 996 samples) and a testing set (i.e., 922 samples). Similar but extending to the forecasting approach in [30], we explore the model capabilities for different short-term prediction windows of 6, 8, and 10 hours, as well as lag windows of 5, 15, 30, 45, 60, 120, and 180 minutes.
- Europe > Switzerland (0.05)
- North America > United States > Maryland > Prince George's County > Laurel (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- North America > United States > California > Orange County > Fullerton (0.04)
Measuring Algorithmic Partisanship via Zero-Shot Classification and Its Implications on Political Discourse
Amidst the rapid normalization of generative artificial intelligence (GAI), intelligent systems have come to dominate political discourse across information media. However, internalized political biases stemming from training data skews, human prejudice, and algorithmic flaws continue to plague this novel technology. This study employs a zero-shot classification approach to evaluate algorithmic political partisanship through a methodical combination of ideological alignment, topicality, response sentiment, and objectivity. A total of 1800 model responses across six mainstream large language models (LLMs) were individually input into four distinct fine-tuned classification algorithms, each responsible for computing one of the aforementioned metrics. The results show an amplified liberal-authoritarian alignment across the six LLMs evaluated, with notable instances of reasoning supersessions and canned refusals. The study subsequently highlights the psychological influences underpinning human-computer interactions and how intrinsic biases can permeate public discourse. The resulting distortion of the political landscape can ultimately manifest as conformity or polarization, depending on the region's pre-existing socio-political structures.
- Asia > China (0.16)
- North America > United States > California > Orange County > Fullerton (0.04)
- Europe > Russia (0.04)
- (12 more...)
PathFormer: A Transformer with 3D Grid Constraints for Digital Twin Robot-Arm Trajectory Generation
Alanazi, Ahmed, Ho, Duy, Lee, Yugyung
Robotic arms require precise, task-aware trajectory planning, yet sequence models that ignore motion structure often yield invalid or inefficient executions. We present a Path-based Transformer that encodes robot motion with a 3-grid (where/what/when) representation and constraint-masked decoding, enforcing lattice-adjacent moves and workspace bounds while reasoning over task graphs and action order. Trained on 53,755 trajectories (80% train / 20% validation), the model aligns closely with ground truth -- 89.44% stepwise accuracy, 93.32% precision, 89.44% recall, and 90.40% F1 -- with 99.99% of paths legal by construction. Compiled to motor primitives on an xArm Lite 6 with a depth-camera digital twin, it attains up to 97.5% reach and 92.5% pick success in controlled tests, and 86.7% end-to-end success across 60 language-specified tasks in cluttered scenes, absorbing slips and occlusions via local re-grounding without global re-planning. These results show that path-structured representations enable Transformers to generate accurate, reliable, and interpretable robot trajectories, bridging graph-based planning and sequence-based learning and providing a practical foundation for general-purpose manipulation and sim-to-real transfer.
- North America > United States > Missouri > Jackson County > Kansas City (0.14)
- North America > United States > California > Orange County > Fullerton (0.04)
PyramidStyler: Transformer-Based Neural Style Transfer with Pyramidal Positional Encoding and Reinforcement Learning
Durairaju, Raahul Krishna, Saruladha, K.
Neural Style Transfer (NST) has evolved from Gatys et al.'s (2015) CNN-based algorithm, enabling AI-driven artistic image synthesis. However, existing CNN and transformer-based models struggle to scale efficiently to complex styles and high-resolution inputs. We introduce PyramidStyler, a transformer framework with Pyramidal Positional Encoding (PPE): a hierarchical, multi-scale encoding that captures both local details and global context while reducing computational load. We further incorporate reinforcement learning to dynamically optimize stylization, accelerating convergence. Trained on Microsoft COCO and WikiArt, PyramidStyler reduces content loss by 62.6% (to 2.07) and style loss by 57.4% (to 0.86) after 4000 epochs--achieving 1.39 s inference--and yields further improvements (content 2.03; style 0.75) with minimal speed penalty (1.40 s) when using RL. These results demonstrate real-time, high-quality artistic rendering, with broad applications in media and design.
- Asia > India > Puducherry (0.04)
- North America > United States > California > Orange County > Fullerton (0.04)
Efficacy of Full-Packet Encryption in Mitigating Protocol Detection for Evasive Virtual Private Networks
Full-packet encryption is a technique used by modern evasive Virtual Private Networks (VPNs) to avoid protocol-based flagging from censorship models by disguising their traffic as random noise on the network. Traditional methods for censoring full-packet-encryption based VPN protocols requires assuming a substantial amount of collateral damage, as other non-VPN network traffic that appears random will be blocked. I tested several machine learning-based classification models against the Aggressive Circumvention of Censorship (ACC) protocol, a fully-encrypted evasive VPN protocol which merges strategies from a wide variety of currently in-use evasive VPN protocols. My testing found that while ACC was able to survive our models when compared to random noise, it was easily detectable with minimal collateral damage using several different machine learning models when within a stream of regular network traffic. While resistant to the current techniques deployed by nation-state censors, the ACC protocol and other evasive protocols are potentially subject to packet-based protocol identification utilizing similar classification models.
- Asia > Middle East > Iran (0.06)
- Asia > China (0.05)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (3 more...)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
Early Detection of At-Risk Students Using Machine Learning
Martinez, Azucena L. Jimenez, Sood, Kanika, Mahto, Rakeshkumar
This research presents preliminary work to address the challenge of identifying at-risk students using supervised machine learning and three unique data categories: engagement, demographics, and performance data collected from Fall 2023 using Canvas and the California State University, Fullerton dashboard. We aim to tackle the persistent challenges of higher education retention and student dropout rates by screening for at-risk students and building a high-risk identification system. By focusing on previously overlooked behavioral factors alongside traditional metrics, this work aims to address educational gaps, enhance student outcomes, and significantly boost student success across disciplines at the University. Pre-processing steps take place to establish a target variable, anonymize student information, manage missing data, and identify the most significant features. Given the mixed data types in the datasets and the binary classification nature of this study, this work considers several machine learning models, including Support Vector Machines (SVM), Naive Bayes, K-nearest neighbors (KNN), Decision Trees, Logistic Regression, and Random Forest. These models predict at-risk students and identify critical periods of the semester when student performance is most vulnerable. We will use validation techniques such as train test split and k-fold cross-validation to ensure the reliability of the models. Our analysis indicates that all algorithms generate an acceptable outcome for at-risk student predictions, while Naive Bayes performs best overall.
- North America > United States > California > Orange County > Fullerton (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Asia > South Korea > Incheon > Incheon (0.04)
- Instructional Material (1.00)
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)
Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks
Yue, Shengbin, Wang, Siyuan, Chen, Wei, Huang, Xuanjing, Wei, Zhongyu
Recent advancements in Large Language Models (LLMs) have led to significant breakthroughs in various natural language processing tasks. However, generating factually consistent responses in knowledge-intensive scenarios remains a challenge due to issues such as hallucination, difficulty in acquiring long-tailed knowledge, and limited memory expansion. This paper introduces SMART, a novel multi-agent framework that leverages external knowledge to enhance the interpretability and factual consistency of LLM-generated responses. SMART comprises four specialized agents, each performing a specific sub-trajectory action to navigate complex knowledge-intensive tasks. We propose a multi-agent co-training paradigm, Long- and Short-Trajectory Learning, which ensures synergistic collaboration among agents while maintaining fine-grained execution by each agent. Extensive experiments on 5 tasks demonstrate SMART's superior performance compared to previous widely adopted methods.
- Europe > United Kingdom (0.14)
- North America > United States > Nebraska (0.04)
- North America > United States > Missouri (0.04)
- (13 more...)
Has Sentiment Returned to the Pre-pandemic Level? A Sentiment Analysis Using U.S. College Subreddit Data from 2019 to 2022
As impact of COVID-19 pandemic winds down, both individuals and society gradually return to pre-pandemic activities. This study aims to explore how people's emotions have changed from the pre-pandemic during the pandemic to post-emergency period and whether it has returned to pre-pandemic level. We collected Reddit data in 2019 (pre-pandemic), 2020 (peak pandemic), 2021, and 2022 (late stages of pandemic, transitioning period to post-emergency period) from subreddits in 128 universities/colleges in the U.S., and a set of school-level characteristics. We predicted two sets of sentiments from a pre-trained Robustly Optimized BERT pre-training approach (RoBERTa) and graph attention network (GAT) that leverages both rich semantic and relational information among posted messages and then applied a logistic stacking method to obtain the final sentiment classification. After obtaining sentiment label for each message, we used a generalized linear mixed-effects model to estimate temporal trend in sentiment from 2019 to 2022 and how school-level factors may affect sentiment. Compared to the year 2019, the odds of negative sentiment in years 2020, 2021, and 2022 are 24%, 4.3%, and 10.3% higher, respectively, which are all statistically significant(adjusted $p$<0.05). Our study findings suggest a partial recovery in the sentiment composition in the post-pandemic-emergency era. The results align with common expectations and provide a detailed quantification of how sentiments have evolved from 2019 to 2022.
- North America > United States > Florida > Hillsborough County > University (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > North Carolina (0.04)
- (49 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Education > Educational Setting > Higher Education (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Leveraging Explainable AI to Analyze Researchers' Aspect-Based Sentiment about ChatGPT
Lakhanpal, Shilpa, Gupta, Ajay, Agrawal, Rajeev
The groundbreaking invention of ChatGPT has triggered enormous discussion among users across all fields and domains. Among celebration around its various advantages, questions have been raised with regards to its correctness and ethics of its use. Efforts are already underway towards capturing user sentiments around it. But it begs the question as to how the research community is analyzing ChatGPT with regards to various aspects of its usage. It is this sentiment of the researchers that we analyze in our work. Since Aspect-Based Sentiment Analysis has usually only been applied on a few datasets, it gives limited success and that too only on short text data. We propose a methodology that uses Explainable AI to facilitate such analysis on research data. Our technique presents valuable insights into extending the state of the art of Aspect-Based Sentiment Analysis on newer datasets, where such analysis is not hampered by the length of the text data.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > United States > Michigan > Kalamazoo County > Kalamazoo (0.04)
- (6 more...)
- Education (1.00)
- Information Technology > Services (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)